COMPOSITIONS AND METHODS FOR PROTEIN
PURIFICATION BASED ON A METAL ION AFFINITY SITE

BACKGROUND OF THE INVENTION

Cross Reference to Related Applications 

This application claims benefit of non-provisional
application U.S. Serial No. 09/078,687, filed May 14, 1998.

Field of the Invention 

This invention relates generally to the field of protein
chemistry. Specifically, the present invention relates to protein
purification using a metal ion affinity site.

Background of the Invention: 

Development of protocols for the isolation and
purification of proteins is often a long and costly process. Such
protocols usually contain multiple steps, where some of the steps
have recoveries as low as 50%. Further, due to variation between
protein molecules, a purification protocol developed and effective
for the purification of one protein is not necessarily useful for the
purification of another. In fact, in most cases considerable
adaptations must be made to a purification protocol to
accommodate the various physical and chemical characteristics of
different proteins.

The ability to prepare hybrid genes by genetic 
engineering technology has opened up new possibilities for the 
purification of proteins. For example, one can link a DNA 
sequence of a protein of interest to a nucleic acid sequence which 
codes for a peptide which has a high binding affinity for a specific 
ligand. The fusion protein product resulting from expression of 
this DNA has attributes of both the protein of interest and the 
high affinity peptide. To purify or immobilize the engineered 
fusion protein, the ligand commonly is linked to a support, and the 
unpurified, engineered protein is then exposed to the 
ligand/support composite and allowed to bind. 

There are numerous advantages of using a high
affinity fusion protein. For example, the use of an affinity peptide
ensures that no part of the native protein of interest is involved in
adsorption, that is, the binding between the fusion protein and the
ligand. At the same time, extremely high selectivity in the
adsorption process is achieved.

Immobilized Metal Ion Affinity Chromatography
(IMAC) is one of the most frequently used techniques for
purification of fusion proteins containing affinity sites for metal
ions. Proper choice of immobilized metal ion, loading conditions
and elution conditions can result in protein purification of up to
about 95-98% in a single chromatographic step. Moreover,
recovery generally is higher than 85%. In addition to the
advantages discussed above, incorporation of a proteolytic,
chemical, or enzymatic cleavage site into the composite DNA,
between the affinity peptide and the sequence of the protein of
interest, provides a means for cleaving the affinity peptide from
the protein of interest to yield the native protein of interest in
highly purified form.

The following publications are representative of the
art: Itakura, et al., Science 198:1056-63 (1977); Germino, et al.,
PNAS USA 80:6848-52 (1983); Nilsson, et al., Nucleic Acids Res.
13:1151-62 (1985); Smith, et al., Gene 32:321-27 (1984); Dobeli, et
al., U.S. Pat. No. 5,284,933; and Dobeli, et al., U.S. Pat. No.
5,310,663.

The prior art is deficient in improved compositions
and methods for affinity immobilization and purification of
proteins. This invention fulfills this long-felt need in the art.

The present invention relates to compositions and
methods for protein purification involving the use of novel,
genetically-engineered fusion proteins. These fusion proteins are
engineered to allow for immobilization and purification via the
high affinity interaction of an affinity peptide of a fusion protein
with a ligand. The affinity peptide is a histidine-rich polypeptide
with the general sequence (HX_n)_m, wherein H is histidine,
X is an amino acid other than histidine, n = 1-8, m = 2-30, and
wherein if n = 1 for more than two adjacent units of HX, at least one
X must be asparagine, phenylalanine, tryptophan, tyrosine, lysine,
methionine, arginine, glutamine, or cysteine. The affinity peptide
is linked to the proteins of interest R1 and R2 to yield a fusion
protein with the formula R1-(HX_n)_m-R2. In a preferred embodiment
of the invention, n = 1-4 and m = 2-10. In a more preferred
embodiment of the invention, n = 1-4 and m = 3-6. In a specific
embodiment of the invention, a fusion protein having the
sequence SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID
No:6) is provided. In another embodiment of the present aspect
of the invention, at least one protease cleavage site is inserted
between the sequence of the protein of interest and the sequence
of the affinity peptide.
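
The constraints on the general sequence (HX_n)_m can be made concrete with a short script. The following Python sketch is illustrative only and is not part of the disclosed method. It assumes one-letter amino acid codes, expresses the pattern as a regular expression, and tests only the longest stretch that the expression matches. The function names are hypothetical.

```python
import re

# Residues of which at least one is required when n = 1 for more than
# two adjacent HX units: Asn, Phe, Trp, Tyr, Lys, Met, Arg, Gln, Cys.
REQUIRED_X = set("NFWYKMRQC")

# (HX_n)_m with n = 1-8 and m = 2-30; X is any residue other than H.
MOTIF = re.compile(r"(?:H[^H]{1,8}){2,30}")


def hx_units(motif):
    """Return the X_n spacer of each HX_n unit in a matched motif."""
    return re.findall(r"H([^H]+)", motif)


def satisfies_n1_rule(units):
    """Check each run of more than two adjacent n = 1 units."""
    run = []
    for spacer in units + [""]:  # the empty sentinel flushes the last run
        if len(spacer) == 1:
            run.append(spacer)
            continue
        if len(run) > 2 and not REQUIRED_X.intersection(run):
            return False
        run = []
    return True


def find_affinity_peptides(sequence):
    """List stretches of a sequence that fit the general formula."""
    return [
        match.group(0)
        for match in MOTIF.finditer(sequence.upper())
        if satisfies_n1_rule(hx_units(match.group(0)))
    ]


if __name__ == "__main__":
    seq_id_6 = "SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM"
    print(find_affinity_peptides(seq_id_6))
    print(find_affinity_peptides("HHHHHH"))  # a plain run of histidines
```

Applied to SEQ ID No:6, the sketch reports a stretch of six HX_n units, whereas an uninterrupted run of histidines does not fit the pattern because every H must be followed by at least one non-histidine residue.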

In another aspect of the invention, there is provided a
DNA sequence coding for a fusion protein comprising a protein of
interest fused at its amino-terminus or carboxy-terminus to at
least one affinity peptide, where the fusion protein has the
general formula R1-(HX_n)_m-R2, wherein R1 or R2 is the protein of
interest, H is histidine, X is an amino acid other than histidine,
n = 1-8, m = 2-30, and wherein if n = 1 for more than two adjacent
units of HX, at least one X must be asparagine, phenylalanine,
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or
cysteine. In a specific embodiment of this aspect of the invention,
there is provided a DNA sequence which codes for a protein where
the fusion protein has the sequence

SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID No:6).

In various embodiments of this aspect of the
invention, there is provided a recombinant vector comprising an
expression vector and a DNA sequence coding for a fusion protein
comprising a protein of interest fused at its amino-terminus or
carboxy-terminus to at least one affinity peptide as described
above, wherein the recombinant vector is capable of directing
expression of the DNA sequence in a suitable host organism. The
present invention also provides a host organism containing a
recombinant expression vector comprising a DNA sequence coding
for a fusion protein comprising a protein of interest fused at its
amino-terminus or carboxy-terminus to at least one affinity
peptide as described above, wherein the organism is capable of
expressing said DNA sequence.

In yet an additional aspect of this invention, there is
provided a method for purifying the novel fusion proteins of the
present invention, comprising the steps of: contacting a protein
sample containing the fusion protein in a mixture with other
proteins with a metal chelate resin under conditions where the
fusion protein binds to the resin to produce a resin-fusion protein
complex; washing the resin-fusion protein complex with a buffer
to remove the other, unbound proteins; and eluting the bound
fusion protein from the washed resin-fusion protein complex. One
embodiment of this method includes inserting at least one
protease cleavage site between the protein of interest and the
affinity peptide, and cleaving the protein of interest from the
affinity peptide after purification using the metal chelate resin.

Other and further aspects, embodiments, features and 
advantages of the present invention will be apparent from the 
following description of the invention. 




BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 is a schematic representation of the
pGFPuv/HAT vector. This vector contains one embodiment of the
affinity peptide of the present invention fused to the N-terminus
of a UV mutant of Green Fluorescent Protein. Only unique
restriction sites are noted.

Figure 2 is a schematic representation of the vector
(pUC19/HS) containing part of the affinity peptide (AP) at the
N-terminus of the enterokinase (EK) cleavage site followed by a
multiple cloning site (MCS). Only restriction sites are denoted.

Figure 3 is a schematic representation of
restriction maps of a vector with three frame shifts containing
part of the affinity peptide and the enterokinase cleavage site that
is used for expression of recombinant proteins.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to compositions and
methods for purification of novel, genetically-engineered fusion
proteins. Immobilization and purification is achieved via a
high-affinity interaction of an affinity peptide portion of the fusion
protein with a ligand. The affinity peptide is a histidine-rich
polypeptide with a general sequence (HX_n)_m which is linked to
proteins of interest R1 or R2; wherein H is histidine, X is an amino
acid other than histidine, n = 1-8, m = 2-30, and wherein if n = 1 for
more than two adjacent units of HX, at least one X must be
asparagine, phenylalanine, tryptophan, tyrosine, lysine,
methionine, arginine, glutamine, or cysteine. This high affinity
polypeptide is incorporated by fusion to the N- or C-terminal
sequence of a protein of interest and is used for high selectivity
purification of the fusion protein.

The affinity of the high affinity peptide is for
immobilized metal ions. The strength of binding between the high
affinity peptide and an appropriate metal ion is very high; thus,
isolation of the fusion proteins is very selective. However,
association between the peptide and the ligand is also reversible.
Once the fusion protein has been allowed to associate or adsorb
with the metal ion ligand, the protein can be dissociated or
eluted from the metal ion/adsorbent by addition of a competitive
ligand such as imidazole, or by decreasing the pH, which leads to
protonation of the nitrogen in the imidazole ring of the histidine
side chain and release of the adsorbed protein. Because of this
reversibility, the protein is recovered in a purified, unbound form.
Further, regeneration and reuse of the metal ion/adsorbent or
support multiple times, even more than 100 times, is possible.
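
The effect of lowering the pH can be estimated with the Henderson-Hasselbalch relation. The sketch below is an illustration rather than a description of the disclosed procedure. It assumes a side-chain pKa of about 6.0, a common textbook approximation for free histidine; the effective value within a particular peptide may differ. The function name is hypothetical.

```python
def protonated_fraction(ph, pka=6.0):
    """Fraction of imidazole side chains that carry a proton at a given pH.

    pka = 6.0 is an assumed, approximate value for the histidine side chain.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    for ph in (7.0, 6.0, 5.0, 4.0):
        print(f"pH {ph:.1f}: {protonated_fraction(ph):.1%} protonated")
```

Under this assumption only a small fraction of the histidine side chains is protonated at pH 7.0, where the protein is loaded, and the fraction rises steeply as the pH falls below the pKa, consistent with release of the adsorbed protein at low pH.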

An additional feature of the protein purification and
immobilization techniques based on the principles of the present
invention is the high probability that the purified and regenerated
protein of interest will retain full biological activity and
specificity. This is because the affinity peptide is involved in the
immobilization/binding process whereas the portion of the fusion
protein that contains the protein of interest is not.

Incorporation of a proteolytic site between the high
affinity peptide and the sequence of the protein of interest
provides the means to regenerate the protein of interest from the
fusion protein. Regeneration is achieved by limited proteolysis of
the fusion protein and a second chromatography step in which the
proteolytic product is passed through an immobilized metal ion
affinity column. Indeed, one can utilize the same column as was
used to immobilize and purify the fusion protein. In the second
chromatography step following proteolysis, the cleaved
protein of interest passes through the column without
immobilization or adsorption, whereas the high affinity peptide is
adsorbed on the column.

One embodiment of the present invention features
incorporation of nucleic acid sequences which code for secretion
signals into the DNA sequence that codes for the fusion protein.
Such secretion signals cause the fusion protein to be secreted into
the media after synthesis in a host cell. Since a considerable
amount of total cellular protein remains in the cell, secretion
improves dramatically the isolation and purification of the fusion
protein by eliminating the need for cell disruption, protein
extraction, and/or removal of unwanted cellular components [...].

As used herein, the terms "high affinity peptide" or
"affinity peptide" refer to a histidine-rich polypeptide with a
general sequence (HX_n)_m which is linked to proteins of interest R1
or R2; wherein H is histidine, X is an amino acid other than
histidine, n = 1-8, m = 2-30, and wherein if n = 1 for more than two
adjacent units of HX, at least one X must be asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

As used herein, the term "protein of interest" shall
[...] purpose of purification or immobilization.

As used herein, the term "fusion protein" shall refer to
the protein hybrid containing the affinity peptide and the protein
of interest or any amino acid sequence of interest. The fusion
protein has the general formula R1-(HX_n)_m-R2; wherein R1 or R2 is
the protein of interest, H is histidine, X is an amino acid other than
histidine, n = 1-8, m = 2-30, and wherein if n = 1 for more than two
adjacent units of HX, at least one X must be asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

As used herein, the terms "secretion sequence" or
"secretion signal sequence" shall refer to an amino acid signal
sequence which leads to the transport of a protein containing the
signal sequence outside the cell membrane. In the present case, a
fusion protein of the present invention may contain such a
secretion sequence to enhance and simplify purification.
Representative examples of secretion signal sequences are well
known to those having ordinary skill in this art.

As used herein, the term "proteolytic site" shall refer
to any amino acid sequence recognized by any proteolytic enzyme.
In the present case, a fusion protein of the present invention may
contain such a protease site between the protein of interest and
the affinity peptide and/or other amino acid sequences so that the
protein of interest may be separated easily from these
heterologous amino acid sequences.

As used herein, the term "metal ion" refers to any
metal ion for which the affinity peptide has affinity and that can
be used for purification or immobilization of a fusion protein.

As used herein, the terms "adsorbent" or "solid
support" shall refer to a chromatography or immobilization
medium used to immobilize a metal ion.

As used herein, the term "regeneration", in the context
of the fusion protein, shall refer to the process of separating or
eliminating the affinity peptide and other heterologous amino acid
sequences from the fusion protein to render the protein of
interest after purification.

So that the manner in which the above-recited features,
advantages, and objects of the invention become clear and can be
understood in detail, particular descriptions of the invention may
be had by reference to particular embodiments described in the
Examples below; however, the following description and examples
are given for the purpose of illustrating various, specific
embodiments of the invention, and are not meant to limit the
scope of the invention in any fashion.

EXAMPLE 1

Extraction of lactate dehydrogenase from chicken breast muscle

A naturally-occurring peptide sequence from the N-
terminus of lactate dehydrogenase (LDH) from chicken muscle
(Gallus gallus) was used for initial experiments. The protein
includes a stretch of approximately 30 amino acids which has a
sequence consistent with the general formula of the fusion protein
of the present invention

(SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID No:6)).
Further, LDH has the feature that the enzyme itself can be assayed
easily for activity. Thus, the naturally-occurring chicken muscle
LDH served as a "fusion protein" for these experiments in the
sense that it contained both a high affinity peptide and a protein
of interest.

Extraction of chicken breast muscle LDH was
performed by cutting 15 g of frozen chicken muscle, free of blood
vessels, into small pieces and transferring the material to a
commercial blender along with 150 mL of extraction buffer (50
mM sodium phosphate, 1 mM EDTA, 1 mM magnesium acetate pH
7.5, 1 mM 2-mercaptoethanol (0.2 L) stored for at least 30
minutes at 4°C). The mixture was homogenized twice at 4°C for 30
seconds, with a 10-minute pause between the bursts. After the
second homogenization, the mixture was transferred to centrifuge
tubes and centrifuged at 4°C and 10,000 x g for 30 minutes. The
clear supernatant was collected and used as a starting sample for
the purification of lactate dehydrogenase.






EXAMPLE 2

Purification of lactate dehydrogenase on Ni2+-Chelating
Sepharose FF

Lactate dehydrogenase was purified by IMAC in the
following manner: approximately 5 mL of Chelating Sepharose FF
(Amersham, Pharmacia) was transferred to a vacuum bottle,
diluted with an equal volume of deionized water and degassed
under vacuum for 10 minutes. The gel suspension was poured
into a column (10 x 1 cm i.d.) trapped on the bottom with a
degassed adapter and left to settle. The column was then filled to
the top with degassed deionized water, and a top adapter was
gently pushed down the column bed until there was no space
between the top surface of the gel and the adapter. The column
was washed with 3 column volumes of deionized water at a flow
rate of 0.5 mL per min.
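
The column geometry and wash step above lend themselves to a quick arithmetic check. The following Python sketch is illustrative only. It assumes that the 5 mL of gel corresponds to the settled bed volume and that one "column volume" equals that bed volume, neither of which is stated explicitly in the text. The function names are hypothetical.

```python
import math


def bed_height_cm(bed_volume_ml, inner_diameter_cm):
    """Height of a cylindrical gel bed (1 mL equals 1 cubic centimetre)."""
    cross_section_cm2 = math.pi * (inner_diameter_cm / 2.0) ** 2
    return bed_volume_ml / cross_section_cm2


def wash_time_min(bed_volume_ml, column_volumes, flow_ml_per_min):
    """Time needed to pass a given number of column volumes."""
    return bed_volume_ml * column_volumes / flow_ml_per_min


if __name__ == "__main__":
    # Assumed: 5 mL settled bed in a column of 1 cm inner diameter.
    print(f"bed height: {bed_height_cm(5.0, 1.0):.2f} cm")
    print(f"wash time:  {wash_time_min(5.0, 3, 0.5):.0f} min")
```

Under these assumptions the bed occupies roughly 6.4 cm of the 10 cm column, and the three-column-volume water wash takes about 30 minutes at 0.5 mL per min.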

The chicken muscle extract (14 mL) was equilibrated
by gel filtration on Sephadex G-25 columns with equilibration
buffer (20 mM sodium phosphate buffer containing 1.0 M sodium
chloride and 0.06 M imidazole pH 7.0 (1 L)). The IMAC column
was then charged with Ni(II) ions using 20 mL of a 0.02 M
Ni(NO3)2 solution. The excess metal was washed from the column
with deionized water at a flow rate of 0.5 mL per minute and the
column was then equilibrated with 5-10 volumes of equilibration
buffer (20 mM sodium phosphate buffer containing 1.0 M sodium
chloride and 0.06 M imidazole pH 7.0 (1 L)).
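
The quantities implied by the equilibration buffer and the nickel charging solution can be worked out as follows. This Python sketch is an illustration, not part of the described protocol. The molar masses are standard values for anhydrous sodium chloride and imidazole; the phosphate component is left out because the text does not state which sodium phosphate salts were used. The names are hypothetical.

```python
# Molar masses in g/mol (standard values; not given in the text).
MOLAR_MASS = {"NaCl": 58.44, "imidazole": 68.08}


def grams_needed(compound, molar_conc, volume_l):
    """Mass of solute for a solution of the given molarity and volume."""
    return molar_conc * volume_l * MOLAR_MASS[compound]


def millimoles(molar_conc, volume_ml):
    """Amount of solute, in mmol, contained in a volume given in mL."""
    return molar_conc * volume_ml


if __name__ == "__main__":
    # Equilibration buffer: 1.0 M NaCl and 0.06 M imidazole in 1 L.
    print(f"NaCl:      {grams_needed('NaCl', 1.0, 1.0):.2f} g")
    print(f"imidazole: {grams_needed('imidazole', 0.06, 1.0):.2f} g")
    # Charging solution: 20 mL of 0.02 M Ni(NO3)2.
    print(f"Ni(II) applied: {millimoles(0.02, 20.0):.2f} mmol")
```

For 1 L of equilibration buffer this corresponds to about 58.4 g of sodium chloride and about 4.1 g of imidazole, and the charging step applies 0.4 mmol of Ni(II) to the column.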

The IMAC purification was carried out by loading the
equilibrated extract onto the IMAC column at a flow rate of 0.5
mL per min. Fractions of 1 mL were collected. The column was
washed with equilibration buffer until a baseline was reached
(absorbance of the fractions at 280 nm was less than 2 mAU
higher than the absorbance of the equilibration buffer). The
adsorbed material was eluted with elution buffer (20 mM sodium
phosphate buffer containing 1.0 M sodium chloride and 0.3 M
imidazole pH 7.0 (0.2 L)) and absorbance at 280 nm was
determined on a spectrophotometer. Protein content of each
fraction was determined as described in M. Bradford, Analytical
Biochemistry, 72 (1976) 248, and lactate dehydrogenase activity
was determined as described in F. Kubowitz and P. Ott, Biochem. Z.,
314 (1943) 94. Results indicated that more than 95% of the
lactate dehydrogenase activity was recovered in the elution
fractions.
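
The baseline criterion and the recovery figure above can be expressed as two small calculations. The Python sketch below is illustrative only. The function names are hypothetical, and the activity values in the usage example are invented placeholders chosen to show the arithmetic, not measurements from this Example.

```python
def baseline_reached(fraction_a280_mau, buffer_a280_mau, tolerance_mau=2.0):
    """True once a wash fraction is less than 2 mAU above the buffer."""
    return (fraction_a280_mau - buffer_a280_mau) < tolerance_mau


def percent_recovery(loaded_activity, eluted_fraction_activities):
    """Enzyme activity recovered in the elution fractions, in percent."""
    return 100.0 * sum(eluted_fraction_activities) / loaded_activity


if __name__ == "__main__":
    # Placeholder numbers for illustration only (arbitrary activity units).
    print(baseline_reached(fraction_a280_mau=11.5, buffer_a280_mau=10.0))
    print(f"{percent_recovery(1000.0, [120.0, 540.0, 260.0, 45.0]):.1f}%")
```

With these placeholder values the wash fraction meets the baseline criterion and the pooled elution fractions account for 96.5% of the loaded activity.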
EXAMPLE 3

Characterization of lactate dehydrogenase binding

Further experiments were performed both with native
LDH, a tetramer of about 140 kD, and a subunit of the enzyme,
obtained after warming of the crude chicken muscle extract to
45°C for 10 minutes. Both the tetramer and the subunit were
allowed to associate with the immobilized Ni support, and both
forms of LDH were retained. This result demonstrates that the
retention of the LDH enzyme on immobilized Ni is not peculiar to
the tetrameric form of the enzyme; that is, binding does not
require "cooperation" between subunits. Instead, the single
subunit of the enzyme also had affinity for the nickel ion, and this
affinity was demonstrated to be virtually identical to the affinity
shown for the tetramer. Both the native protein and the subunit
were adsorbed in buffer with an imidazole concentration up to 60
mM and both were eluted completely at a concentration of 300
mM imidazole.

To ascertain that it is the polyhistidine portion of the
LDH that provides affinity for the nickel ion, the tetrameric form of
the LDH enzyme was subjected to CNBr cleavage to produce a
mixture of peptides. This mixture of peptides was applied to a Ni-
IDA column with metal ion capacity of 32 mmol per mL gel.
Loading conditions were the same as those used for the
purification of the enzyme from the crude extract, described
above. The adsorbed material was eluted with 300 mM
imidazole, and subjected to reversed-phase chromatography (RPC).
The chromatographic peak containing about 80% of the adsorbed
material was then subjected to amino acid analysis. The results
obtained demonstrate that this peak corresponds to the N-
terminal peptide from LDH and that this peptide contains the
polyhistidine sequence. In addition, the fact that the peptide
retained its binding affinity even after treatment with CNBr in
the presence of 70% TFA is proof that the binding is not due to a
rigid secondary conformation structure.






EXAMPLE 4

Purification of lactate dehydrogenase on Co2+-TALON agarose

Extraction of chicken breast muscle LDH was
performed as in Example 1, and the extract was equilibrated by gel
filtration on Sephadex G-25 columns with equilibration buffer (20
mM sodium phosphate buffer containing 1.0 M sodium chloride and
0.06 M imidazole pH 7.0 (1 L)).

The IMAC column was prepared in the following
manner: Approximately 2.75 mL of Co2+-TALON Superflow 6
(Amersham, Pharmacia) was transferred to a vacuum bottle,
diluted with the same volume of deionized water and degassed
under vacuum for 10 minutes. The gel suspension was poured
into a column (3 x 1 cm i.d.) trapped on the bottom with a
degassed adapter and left to settle. The column was filled to the
top with degassed deionized water, and a top adapter was gently
pushed down toward the column bed until there was no space
between the top surface of the gel and the adapter. The column
was washed with 3 column volumes of deionized water at a flow
rate of 0.5 mL per min.

Purification of the fusion protein on Co2+-TALON
Superflow 6 was carried out by first equilibrating the IMAC
column with 5 to 10 column volumes of the equilibration buffer.
The sample was then loaded on the IMAC column at a flow rate of
1.0 mL per min, and 1 mL fractions were collected. The column
was washed with the equilibration buffer until a baseline was
reached (absorbance of the fractions at 280 nm was less than 2
mAU higher than the absorbance of the equilibration buffer).

The adsorbed material was eluted with elution buffer
(20 mM sodium phosphate buffer containing 1.0 M sodium
chloride and 0.3 M imidazole pH 7.0 (0.2 L)) and absorbance at
280 nm was determined on a spectrophotometer. Protein content
of each fraction was determined as described in M. Bradford,
Analytical Biochemistry, 72 (1976) 248, and lactate
dehydrogenase activity was determined as described in F.
Kubowitz and P. Ott, Biochem. Z., 314 (1943) 94. As in Example 2,
more than 95% of the lactate dehydrogenase activity was
recovered in the elution fractions.


EXAMPLE 5 

Isolation and purification of fusion protein consisting of affinity
peptide and Green Fluorescent Protein UV Mutant (GFPuv)

An affinity peptide/GFP fusion protein was isolated
from E. coli cells which had been transformed with the pGFPuv.HS
vector. Cell paste (0.39 g) was transferred to a pre-cooled mortar,
1.2 g of alumina was added, and the mixture was ground for 2
minutes. Extraction buffer (5 mL, stored at 4°C) was added, and,
after additional grinding for 2 minutes, the suspension was
transferred into four Eppendorf tubes and centrifuged for 12
minutes at 12,000 rpm (11,750 x g). The clear supernatant
(approximately 6 mL) was used as a starting sample for IMAC.
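
The stated speed and relative centrifugal force can be cross-checked with the standard conversion RCF = 1.118 x 10^-5 x r x rpm^2, where r is the rotor radius in centimetres. The Python sketch below is illustrative only. The rotor radius is not given in the text and is back-calculated here from the two stated values; the function names are hypothetical.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) for a speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_for(rcf_value, rpm):
    """Rotor radius in cm implied by a stated RCF and rotor speed."""
    return rcf_value / (RCF_CONSTANT * rpm ** 2)


if __name__ == "__main__":
    radius = radius_for(11750, 12000)
    print(f"implied rotor radius: {radius:.2f} cm")
    print(f"check: {rcf(12000, radius):.0f} x g at 12,000 rpm")
```

The two stated values are mutually consistent for a rotor radius of roughly 7.3 cm, a plausible size for a microcentrifuge rotor, which suggests that the figure in parentheses is the force at the maximum radius.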

The extraction and chromatography equilibration
buffers consisted of 20 mM sodium phosphate buffer containing
1.0 M sodium chloride and 5 mM imidazole pH 7.0 (1 L). The
elution buffer for IMAC consisted of 20 mM sodium phosphate
buffer containing 1.0 M sodium chloride and 150 mM imidazole
pH 7.0 (0.2 L).

The IMAC was carried out in the following manner:
Approximately 2.75 mL of Co2+-TALON Superflow 6 (Amersham,
Pharmacia) was transferred to a vacuum bottle, diluted with the
same volume of deionized water and degassed under vacuum for
10 minutes. The gel suspension was poured into a column (3 x 1
cm i.d.) trapped on the bottom with a degassed adapter and left
to settle. The column was filled to the top with degassed
deionized water, and a top adapter was gently pushed down
toward the column bed until there was no space between the top
surface of the gel and the adapter. The column was washed with
3 column volumes of deionized water at a flow rate of 0.5 mL per
min.

Purification of the fusion protein on Co2+-TALON
Superflow 6 was carried out by first equilibrating the IMAC
column with 5 to 10 column volumes of the equilibration buffer.
The sample was then loaded on the IMAC column at a flow rate of
1.0 mL per min, and 1 mL fractions were collected. The column
was washed with the equilibration buffer until a baseline was
reached (absorbance of the fractions at 280 nm was less than 2
mAU higher than the absorbance of the equilibration buffer). The
adsorbed material was then eluted with elution buffer.

Absorbance of each fraction at 280 nm was
determined on a spectrophotometer, and protein content of each
fraction was determined. Fluorescence of each fraction was
determined on a microplate reader, and the purity of the fusion
protein was determined also by SDS-electrophoresis. More than
85% of the fusion protein was recovered in the fractions obtained.

Part of the cDNA sequence, and the amino acid sequence encoded
by this cDNA sequence, of a vector containing the affinity peptide
at the N-terminus of Green Fluorescent Protein-UV mutant
(GFPuv) is shown in SEQ ID No. 1 and SEQ ID No. 2, respectively.
The full cDNA sequence of a vector containing the construct of the
affinity peptide at the N-terminus of GFPuv is shown in SEQ ID No.
3. The full cDNA sequence of a vector containing part of the
affinity peptide at the N-terminus of the enterokinase cleavage
site and the amino acid sequence encoded by this cDNA
corresponding to the start of translation site, the affinity peptide
and the multiple cloning site are shown in SEQ ID Nos. 4 and 5.



EXAMPLE 6

Construction of fusion proteins

A DNA sequence corresponding to the affinity peptide
of the present invention is fused to the DNA coding sequence of a
protein of interest. The polynucleotide sequence for the affinity
peptide is fused most generally at or close to the DNA sequence
coding for the N- or C-terminal amino acid of the protein of
interest. This results in a DNA sequence which codes for a fusion
protein comprising the affinity peptide and the protein of interest.

In addition, a polynucleotide sequence that codes for a
proteolytic site is incorporated into the fusion protein DNA
sequence between the sequence for the affinity peptide and the
sequence of the protein of interest. This type of DNA construct
results in a fusion protein product having a proteolytic site. This
site allows for the eventual regeneration of the protein of interest
from the fusion protein by limited proteolysis and a second
chromatography step. The second chromatography step, in which
the product of the proteolysis is loaded onto an immobilized metal
ion affinity column, results in the separation of the protein of
interest from the affinity peptide.
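
The layout of such a construct, and the outcome of the cleavage step, can be sketched at the amino acid level. The Python example below is illustrative only and rests on several assumptions: the tag is taken to be the histidine-rich stretch of SEQ ID No:6, the cleavage site is the commonly cited enterokinase recognition sequence Asp-Asp-Asp-Asp-Lys with cleavage after the lysine, and the protein of interest is a short placeholder. None of these specific sequences or function names is prescribed by the text.

```python
# Assumption: histidine-rich stretch taken from SEQ ID No:6.
AFFINITY_PEPTIDE = "HLIHNVHKEEHAHAHNK"
# Assumption: enterokinase recognition sequence; cleavage after the Lys.
EK_SITE = "DDDDK"
# Hypothetical placeholder for the protein of interest.
PROTEIN_OF_INTEREST = "MSKGEELFTG"


def build_fusion(tag, site, protein):
    """N-terminal tag, then the protease site, then the protein."""
    return tag + site + protein


def cleave_after_site(fusion, site):
    """Split the fusion after the first occurrence of the protease site."""
    head, found, tail = fusion.partition(site)
    if not found:
        raise ValueError("cleavage site not present in fusion protein")
    return head + site, tail


if __name__ == "__main__":
    fusion = build_fusion(AFFINITY_PEPTIDE, EK_SITE, PROTEIN_OF_INTEREST)
    tag_fragment, released = cleave_after_site(fusion, EK_SITE)
    print("binds to the metal ion column:", tag_fragment)
    print("passes through the column:    ", released)
```

In this arrangement the cleavage site stays with the tag fragment, so the released protein carries no residues from the heterologous sequences, which matches the aim of recovering the protein of interest in the second chromatography step.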

An additional embodiment of the present invention
provides a DNA sequence coding for a polypeptide "secretion
signal" introduced into the DNA that codes for the fusion protein.
This secretion signal, when expressed, causes the fusion protein to
be secreted into the culture media after the fusion protein is
synthesized in the cell. Since a large number of cellular proteins
are not transported out of the cell, isolation and purification of the
fusion protein is enhanced as the requirements for cell disruption,
extraction and removal of unwanted cell components are
eliminated.

The present invention is directed to a fusion protein of
general formula R1-(HX_n)_m-R2 comprising: a protein of interest R1
or R2 fused at its amino terminus or carboxy terminus to at least
one affinity peptide, said affinity peptide having a formula (HX_n)_m,
wherein H is histidine, X is an amino acid other than histidine,
n = 1-8, m = 2-30, and wherein if n = 1 for more than two adjacent
units of HX, at least one X must be asparagine, phenylalanine,
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or
cysteine. Preferably, n = 1-4 and m = 3-10. In one preferred
embodiment, n = 1-4 and m = 3-6. Preferably, if n = 1 for more than
two adjacent units of HX, only one X is asparagine, phenylalanine,
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or
cysteine. In one preferred embodiment, the fusion protein has a
sequence SEQ ID No. 1. The fusion protein may contain at least
one protease cleavage site between said protein of interest and
said affinity peptide. Preferably, the affinity peptide has affinity
for metal ions. A representative metal ion is a nickel ion. The
fusion protein may further comprise a secretion signal sequence.

The present invention is also directed to a DNA
sequence coding for a fusion protein of general formula
R1-(HX_n)_m-R2 comprising: a protein of interest R1 or R2 fused at
its amino terminus or carboxy terminus to at least one affinity
peptide, said affinity peptide having a formula (HX_n)_m, wherein H
is histidine, X is an amino acid other than histidine, n = 1-8,
m = 2-30, and wherein if n = 1 for more than two adjacent units of
HX, at least one X must be asparagine, phenylalanine, tryptophan,
tyrosine, lysine, methionine, arginine, glutamine, or cysteine.
Preferably, the fusion protein has the sequence shown in SEQ ID
No:6.

The present invention is also directed to a
recombinant vector comprising a DNA sequence disclosed herein,
wherein said recombinant vector is capable of directing
expression of said DNA sequence for said fusion protein in a
suitable host organism. The present invention is also directed to a
host organism containing a recombinant vector disclosed herein,
wherein said organism is capable of expressing said fusion
protein.

The present invention is also directed to a method for
purifying the fusion protein disclosed herein, comprising the steps
of: contacting a protein sample containing said fusion protein in a
mixture with other proteins with a metal chelate resin under
conditions where said fusion protein binds to said resin to produce
a resin-fusion protein complex; washing said resin-fusion protein
complex with a buffer to remove said other, unbound proteins;
and eluting said bound fusion protein from the washed resin-
fusion protein complex; wherein said eluted fusion protein is
purified. This method may further comprise the step of cleaving
said protein of interest from said affinity peptide. Moreover, this
method further comprises the step of separating said cleaved
protein of interest from said affinity peptide using a metal
chelate resin under conditions where said affinity peptide binds to
said metal of said resin and said protein of interest does not.

Any patents or publications mentioned in this
specification are indicative of the levels of those skilled in the art
to which the invention pertains. These patents and publications
are herein incorporated by reference to the same extent as if each
individual publication was specifically and individually indicated
to be incorporated by reference.

One skilled in the art will readily appreciate that the
present invention is well adapted to carry out the objects and
obtain the ends and advantages mentioned, as well as those
inherent therein. The present examples along with the methods,
procedures, treatments, molecules, and specific compounds
described herein are presently representative of preferred
embodiments, are exemplary, and are not intended as limitations
on the scope of the invention. Changes therein and other uses will
occur to those skilled in the art which are encompassed within the
spirit of the invention as defined by the scope of the claims.

WHAT IS CLAIMED IS:

1. A fusion protein comprising: a protein of interest
fused at its amino terminus or carboxy terminus to at least one
affinity peptide, said fusion protein having a formula
R1-(HXn)m-R2, wherein R1 or R2 is said protein of interest, H is
histidine, X is an amino acid other than histidine, n=1-8, m=2-30,
and wherein if n=1 for more than two adjacent units of HX, at least
one X must be asparagine, phenylalanine, tryptophan, tyrosine,
lysine, methionine, arginine, glutamine, or cysteine.

2. The fusion protein of claim 1, wherein n=1-4.

3. The fusion protein of claim 1, wherein m=3-10.

4. The fusion protein of claim 1, wherein n=1-4 and
m=3-6.

5. The fusion protein of claim 1, wherein if n=1 for
more than two adjacent units of HX, only one X is asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

6. The fusion protein of claim 2, wherein said
fusion protein has a sequence SEQ ID No. 6.

7. The fusion protein of claim 1, wherein said
fusion protein contains at least one protease cleavage site between
said protein of interest and said affinity peptide.

8. The fusion protein of claim 1, wherein said
affinity peptide has affinity for metal ions.

9. The fusion protein of claim 8, wherein said metal
ions are nickel ions.

10. The fusion protein of claim 1, wherein said
fusion protein further comprises a secretion signal sequence.

11. A DNA sequence coding for a fusion protein
comprising a protein of interest fused at its amino- or
carboxy-terminus to at least one affinity peptide, said fusion protein
having the formula R1-(HXn)m-R2, wherein R1 or R2 is said
protein of interest, n=1-8, m=2-30, and wherein if n=1 for more
than two adjacent units of HX, at least one X must be asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

12. The DNA sequence of claim 11, wherein said
fusion protein has the sequence shown in SEQ ID No:1.

13. A recombinant vector comprising a DNA
sequence of claim 11, wherein said recombinant vector is capable
of directing expression of said DNA sequence for said fusion
protein in a suitable host organism.

14. A host organism containing a recombinant vector
of claim 13, wherein said organism is capable of expressing said
fusion protein.

15. A method for purifying the fusion protein of
claim 1, comprising the steps of:

contacting a protein sample containing said fusion
protein in a mixture with other proteins with a metal chelate resin
under conditions where said fusion protein binds to said resin to
produce a resin-fusion protein complex;

washing said resin-fusion protein complex with a
buffer to remove said other, unbound proteins; and

eluting said bound fusion protein from the washed
resin-fusion protein complex; wherein said eluted fusion protein is
purified.

16. The method of claim 15, further comprising the
step of cleaving said protein of interest from said affinity peptide.

17. The method of claim 16, further comprising the
step of separating said cleaved protein of interest from said
affinity peptide using a metal chelate resin under conditions
where said affinity peptide binds to said metal of said resin and
said protein of interest does not.

SEQUENCE LISTING

<110> Tchaga, Grigoriy
      Jokhadze, George G.

<120> Compositions and Methods for Protein Purification
      Based on a Novel Metal Ion Affinity Site

<130> D6094PCT 

<140> 

<141> 1999-05-14 

<150> US 09/078,687 

<151> 1998-05-14 

<160> 6 

<210> 1 
<211> 840 
<212> DNA 

<213> artificial sequence 

<220> 

<223> Partial cDNA sequence of a vector containing the 

affinity peptide at the N-terminus of Green 
Fluorescent Protein-UV mutant (GFPuv) 

<400> 1 

atgaccatga ttacgccaag cttgtctctc aaggatcatc tcatccacaa 50 
tgtccacaaa gaggagcacg ctcatgccca caacaagatc agcgtggttg 100 
gtgtgggtgc agttggaccg gtaagtaaag gagaagaact tttcactgga 150 
gttgtcccaa ttcttgttga attagatggt gatgttaatg ggcacaaatt 200 
ttctgtcagt ggagagggtg aaggtgatgc aacatacgga aaacttaccc 250 
ttaaatttat ttgcactact ggaaaactac ctgttccatg gccaacactt 300
gtcactactt tctcttatgg tgttcaatgc ttttcccgtt atccggatca 350 
tatgaaacgg catgactttt tcaagagtgc catgcccgaa ggttatgtac 400 
aggaacgcac tatatctttc aaagatgacg ggaactacaa gacgcgtgct 450 
gaagtcaagt ttgaaggtga tacccttgtt aatcgtatcg agttaaaagg 500 
tattgatttt aaagaagatg gaaacattct cggacacaaa ctcgagtaca 550 
actataactc acacaatgta tacatcacgg cagacaaaca aaagaatgga 600 
atcaaagcta acttcaaaat tcgccacaac attgaagatg gatccgttca 650 
actagcagac cattatcaac aaaatactcc aattggcgat ggccctgtcc 700 
ttttaccaga caaccattac ctgtcgacac aatctgccct ttcgaaagat 750 
cccaacgaaa agcgtgacca catggtcctt cttgagtttg taactgctgc 800 
tgggattaca catggcatgg atgagctcta caaataatga 840

<210> 2
<211> 278 
<212> PRT 

<213> artificial sequence 

<220> 

<223> Amino acid sequence encoded by partial cDNA sequence 

of a vector containing the affinity peptide at the
N-terminus of Green Fluorescent Protein-UV mutant (GFPuv)

<400> 2 

Met Thr Met Ile Thr Pro Ser Leu Ser Leu Lys Asp His Leu Ile
                  5                  10                  15

His Asn Val His Lys Glu Glu His Ala His Ala His Asn Lys Ile
                 20                  25                  30

Ser Val Val Gly Val Gly Ala Val Gly Pro Val Ser Lys Gly Glu
                 35                  40                  45

Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly
                 50                  55                  60

Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly
                 65                  70                  75

Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr
                 80                  85                  90

Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe Ser
                 95                 100                 105

Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
                110                 115                 120

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
                125                 130                 135

Arg Thr Ile Ser Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala
                140                 145                 150

Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu
                155                 160                 165

Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys
                170                 175                 180

Leu Glu Tyr Asn Tyr Asn Ser His Asn Val Tyr Ile Thr Ala Asp
                185                 190                 195

Lys Gln Lys Asn Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn
                200                 205                 210

Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn
                215                 220                 225

Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr
                230                 235                 240

Leu Ser Thr Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg
                245                 250                 255

Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr
                260                 265                 270

His Gly Met Asp Glu Leu Tyr Lys
                275

<210> 3
<211> 3384
<212> DNA

<213> artificial sequence 

<220> 

<223> cDNA sequence of a vector containing the construct 

of the affinity peptide at the N-terminus of GFPuv

<400> 3

agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa 50
tgcagctggc acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa 100
cgcaattaat gtgagttagc tcactcatta ggcaccccag gctttacact 150
ttatgcttcc ggctcgtatg ttgtgtggaa ttgtgagcgg ataacaattt 200
cacacaggaa acagctatga ccatgattac gccaagcttg tctctcaagg 250
atcatctcat ccacaatgtc cacaaagagg agcacgctca tgcccacaac 300
aagatcagcg tggttggtgt gggtgcagtt ggaccggtaa gtaaaggaga 350
agaacttttc actggagttg tcccaattct tgttgaatta gatggtgatg 400
ttaatgggca caaattttct gtcagtggag agggtgaagg tgatgcaaca 450
tacggaaaac ttacccttaa atttatttgc actactggaa aactacctgt 500
tccatggcca acacttgtca ctactttctc ttatggtgtt caatgctttt 550
cccgttatcc ggatcatatg aaacggcatg actttttcaa gagtgccatg 600
cccgaaggtt atgtacagga acgcactata tctttcaaag atgacgggaa 650
ctacaagacg cgtgctgaag tcaagtttga aggtgatacc cttgttaatc 700
gtatcgagtt aaaaggtatt gattttaaag aagatggaaa cattctcgga 750
cacaaactcg agtacaacta taactcacac aatgtataca tcacggcaga 800
caaacaaaag aatggaatca aagctaactt caaaattcgc cacaacattg 850
aagatggatc cgttcaacta gcagaccatt atcaacaaaa tactccaatt 900
ggcgatggcc ctgtcctttt accagacaac cattacctgt cgacacaatc 950
tgccctttcg aaagatccca acgaaaagcg tgaccacatg gtccttcttg 1000
agtttgtaac tgctgctggg attacacatg gcatggatga gctctacaaa 1050
taatgaattc caactgagcg ccggtcgcta ccattaccaa cttgtctggt 1100
gtcaaaaata ataggcctac tagtcggccg tacgggccct ttcgtctcgc 1150
gcgtttcggt gatgacggtg aaaacctctg acacatgcag ctcccggaga 1200
cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca agcccgtcag 1250
ggcgcgtcag cgggtgttgg cgggtgtcgg ggctggctta actatgcggc 1300
atcagagcag attgtactga gagtgcacca tatgcggtgt gaaataccgc 1350
acagatgcgt aaggagaaaa taccgcatca ggcggcctta agggcctcgt 1400
gatacgccta tttttatagg ttaatgtcat gataataatg gtttcttaga 1450
cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta 1500
tttttctaaa tacattcaaa tatgtatccg ctcatgagac aataaccctg 1550
ataaatgctt caataatatt gaaaaaggaa gagtatgagt attcaacatt 1600
tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt 1650
gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg 1700
tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg 1750
agagttttcg ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt 1800
ctgctatgtg gcgcggtatt atcccgtatt gacgccgggc aagagcaact 1850
cggtcgccgc atacactatt ctcagaatga cttggttgag tactcaccag 1900
tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt 1950
gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac 2000
gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc 2050
atgtaactcg ccttgatcgt tgggaaccgg agctgaatga agccatacca 2100
aacgacgagc gtgacaccac gatgcctgta gcaatggcaa caacgttgcg 2150
caaactatta actggcgaac tacttactct agcttcccgg caacaattaa 2200
tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc 2250
cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg 2300
gtctcgcggt atcattgcag cactggggcc agatggtaag ccctcccgta 2350
tcgtagttat ctacacgacg gggagtcagg caactatgga tgaacgaaat 2400
agacagatcg ctgagatagg tgcctcactg attaagcatt ggtaactgtc 2450
agaccaagtt tactcatata tactttagat tgatttaaaa cttcattttt 2500
aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa 2550
atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 2600
gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 2650
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa 2700
gagctaccaa ctctttttcc gaaggtaact ggcttcagca gagcgcagat 2750
accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 2800
actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg 2850
gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 2900
atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 2950 
cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag 3000 
cgtgagctat gagaaagcgc cacgcttccc gaagggagaa aggcggacag 3050 
gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 3100 
cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc 3150 
tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 3200 
gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 3250 
cttttgctca catgttcttt cctgcgttat cccctgattc tgtggataac 3300 
cgtattaccg cctttgagtg agctgatacc gctcgccgca gccgaacgac 3350 
cgagcgcagc gagtcagtga gcgaggaagc ggaa 3384



<210> 4 
<211> 2754 
<212> DNA 

<213> artificial sequence 

<220> 

<223> cDNA sequence of a vector containing part of the 

affinity peptide at the N-terminus of enterokinase 
cleavage site

<400> 4

gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata 50
ataatggttt cttagacgtc aggtggcact tttcggggaa atgtgcgcgg 100
aacccctatt tgtttatttt tctaaataca ttcaaatatg tatccgctca 150
tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 200
atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt 250
ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg 300
ctgaagatca gttgggtgca cgagtgggtt acatcgaact ggatctcaac 350
agcggtaaga tccttgagag ttttcgcccc gaagaacgtt ttccaatgat 400
gagcactttt aaagttctgc tatgtggcgc ggtattatcc cgtattgacg 450
ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg 500
gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt 550
aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca 600
acttacttct gacaacgatc ggaggaccga aggagctaac cgcttttttg 650
cacaacatgg gggatcatgt aactcgcctt gatcgttggg aaccggagct 700
gaatgaagcc ataccaaacg acgagcgtga caccacgatg cctgtagcaa 750
tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct 800
tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc 850
acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg 900 
gagccggtga gcgtgggtct cgcggtatca ttgcagcact ggggccagat 950 
ggtaagccct cccgtatcgt agttatctac acgacgggga gtcaggcaac 1000 
tatggatgaa cgaaatagac agatcgctga gataggtgcc tcactgatta 1050 
agcattggta actgtcagac caagtttact catatatact ttagattgat 1100 
ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 1150 
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt 1200 
cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg 1250 
cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt 1300 
ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct 1350 
tcagcagagc gcagatacca aatactgtcc ttctagtgta gccgtagtta 1400 
ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 1450 
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg 1500 
ggttggactc aagacgatag ttaccggata aggcgcagcg gtcgggctga 1550 
acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga 1600 
actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag 1650 
ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag 1700 
cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 1750 
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag 1800 
gggggcggag cctatggaaa aacgccagca acgcggcctt tttacggttc 1850 
ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc 1900 
tgattctgtg gataaccgta ttaccgcctt tgagtgagct gataccgctc 1950 
gccgcagccg aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa 2000 
gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 2050 
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca 2100 
acgcaattaa tgtgagttag ctcactcatt aggcacccca ggctttacac 2150 
tttatgcttc cggctcgtat gttgtgtgga attgtgagcg gataacaatt 2200 
tcacacagga aacagctatg accatgatta cgccaagctt gaaggatcat 2250 
ctcatccaca atgtccacaa agaggagcac gctcatgccc acaacaagat 2300 
cgatgacgat gacaaagtcg acggatcccc gggtaccgag ctcgtaatta 2350 
gctgagaatt cactggccgt cgttttacaa cgtcgtgact gggaaaaccc 2400 
tggcgttacc caacttaatc gccttgcagc acatccccct ttcgccagct 2450 
ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca acagttgcgc 2500 
agcctgaatg gcgaatggcg cctgatgcgg tattttctcc ttacgcatct 2550 
gtgcggtatt tcacaccgca tatggtgcac tctcagtaca atctgctctg 2600 
atgccgcata gttaagccag ccccgacacc cgccaacacc cgctgacgcg 2650 
ccctgacggg cttgtctgct cccggcatcc gcttacagac aagctgtgac 2700
cgtctccggg agctgcatgt gtcagaggtt ttcaccgtca tcaccgaaac 2750
gcgc 2754

<210> 5
<211> 45
<212> PRT

<213> artificial sequence

<220>

<223> Amino acid sequence encoded by cDNA corresponding
to the start of translation site, the affinity
peptide and the multiple cloning site

<400> 5

Met Thr Met Ile Thr Pro Ser Leu Lys Asp His Leu Ile His Asn
                  5                  10                  15

Val His Lys Glu Glu His Ala His Ala His Asn Lys Ile Asp Asp
                 20                  25                  30

Asp Asp Lys Val Asp Gly Ser Pro Gly Thr Glu Leu Val Ile Ser
                 35                  40                  45

<210> 6
<211> 32
<212> PRT

<213> artificial sequence

<220>

<223> Amino acid sequence of fusion protein where n=1-4
for the affinity peptide

<400> 6

Ser Leu Lys Asp His Leu Ile His Asn Val His Lys Glu Glu His
                  5                  10                  15

Ala His Ala His Asn Lys Ile Ser Val Val Gly Val Gly Ala Val
                 20                  25                  30

Gly Met
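
The correspondence between the nucleotide listing of SEQ ID No. 1 and the amino acid listing of SEQ ID No. 2 can be checked by translating the first codons with the standard genetic code. The following is a minimal sketch under stated assumptions: the function and variable names are hypothetical, the standard codon table is assumed to apply, and only the first 90 nucleotides of SEQ ID No. 1 are included.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) until the first stop codon."""
    dna = "".join(dna.lower().split())
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# First 90 nucleotides of SEQ ID No. 1, copied from the listing.
SEQ1_START = (
    "atgaccatga ttacgccaag cttgtctctc aaggatcatc tcatccacaa "
    "tgtccacaaa gaggagcacg ctcatgccca caacaagatc"
)

if __name__ == "__main__":
    print(translate(SEQ1_START))
```

The printed one-letter string can be compared residue by residue with positions 1-30 of SEQ ID No. 2, which span the start of translation and the histidine-containing affinity peptide.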
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